{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "668a88ba",
   "metadata": {},
   "source": [
    "# Huggingface Sagemaker-sdk - Deploy 🤗 Transformers for inference\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "27bbe667",
   "metadata": {},
   "source": [
    "Welcome to this getting started guide, we will use the new Hugging Face Inference DLCs and Amazon SageMaker Python SDK to deploy a transformer model for inference. \n",
    "In this example we deploy a trained Hugging Face Transformer model on to SageMaker for inference."
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e0be335",
   "metadata": {},
   "source": [
    "## API - [SageMaker Hugging Face Inference Toolkit](https://github.com/aws/sagemaker-huggingface-inference-toolkit)\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7c6639cd",
   "metadata": {},
   "source": [
    "Using the `transformers pipelines`, we designed an API, which makes it easy for you to benefit from all `pipelines` features. The API is oriented at the API of the [🤗  Accelerated Inference API](https://api-inference.huggingface.co/docs/python/html/detailed_parameters.html), meaning your inputs need to be defined in the `inputs` key and if you want additional supported `pipelines` parameters you can add them in the `parameters` key. Below you can find examples for requests. \n",
    "\n",
    "**text-classification request body**\n",
    "```python\n",
    "{\n",
    "\t\"inputs\": \"Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days.\"\n",
    "}\n",
    "```\n",
    "**question-answering request body**\n",
    "```python\n",
    "{\n",
    "\t\"inputs\": {\n",
    "\t\t\"question\": \"What is used for inference?\",\n",
    "\t\t\"context\": \"My Name is Philipp and I live in Nuremberg. This model is used with sagemaker for inference.\"\n",
    "\t}\n",
    "}\n",
    "```\n",
    "**zero-shot classification request body**\n",
    "```python\n",
    "{\n",
    "\t\"inputs\": \"Hi, I recently bought a device from your company but it is not working as advertised and I would like to get reimbursed!\",\n",
    "\t\"parameters\": {\n",
    "\t\t\"candidate_labels\": [\n",
    "\t\t\t\"refund\",\n",
    "\t\t\t\"legal\",\n",
    "\t\t\t\"faq\"\n",
    "\t\t]\n",
    "\t}\n",
    "}\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c5fee1a3",
   "metadata": {},
   "outputs": [],
   "source": [
    "!pip install \"sagemaker>=2.48.0\" --upgrade"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6cb886d",
   "metadata": {},
   "source": [
    "## Deploy a Hugging Face Transformer model from S3 to SageMaker for inference\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e91797b9",
   "metadata": {},
   "source": [
    "There are two ways on how you can deploy you SageMaker trained Hugging Face model from S3. You can either deploy it after your training is finished or you can deploy it later using the `model_data` pointing to you saved model on s3.\n",
    "\n",
    "### Deploy the model directly after training\n",
    "\n",
    "If you deploy you model directly after training you need to make sure that all required files are saved in your training script, including the Tokenizer and the Model. \n",
    "```python\n",
    "from sagemaker.huggingface import HuggingFace\n",
    "\n",
    "############ pseudo code start ############\n",
    "\n",
    "# create HuggingFace estimator for running training\n",
    "huggingface_estimator = HuggingFace(....)\n",
    "\n",
    "# starting the train job with our uploaded datasets as input\n",
    "huggingface_estimator.fit(...)\n",
    "\n",
    "############ pseudo code end ############\n",
    "\n",
    "# deploy model to SageMaker Inference\n",
    "predictor = huggingface_estimator.deploy(initial_instance_count=1, instance_type=\"ml.m5.xlarge\")\n",
    "\n",
    "# example request, you always need to define \"inputs\"\n",
    "data = {\n",
    "   \"inputs\": \"Camera - You are awarded a SiPix Digital Camera! call 09061221066 fromm landline. Delivery within 28 days.\"\n",
    "}\n",
    "\n",
    "# request\n",
    "predictor.predict(data)\n",
    "\n",
    "```\n",
    "\n",
    "### Deploy the model using `model_data`"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "d20a02bd-5268-4630-ba9b-fea7dd06b019",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "sagemaker role arn: arn:aws:iam::558105141721:role/sagemaker_execution_role\n"
     ]
    }
   ],
   "source": [
    "import sagemaker\n",
    "import boto3\n",
    "\n",
    "try:\n",
    "    role = sagemaker.get_execution_role()\n",
    "except ValueError:\n",
    "    iam = boto3.client('iam')\n",
    "    role = iam.get_role(RoleName='sagemaker_execution_role')['Role']['Arn']\n",
    "\n",
    "print(f\"sagemaker role arn: {role}\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "fa2fedd1",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "from sagemaker.huggingface import HuggingFaceModel\n",
    "\n",
    "# create Hugging Face Model Class\n",
    "huggingface_model = HuggingFaceModel(\n",
    "   model_data=\"s3://hf-sagemaker-inference/model.tar.gz\",  # path to your trained sagemaker model\n",
    "   role=role, # iam role with permissions to create an Endpoint\n",
    "   transformers_version=\"4.26\", # transformers version used\n",
    "   pytorch_version=\"1.13\", # pytorch version used\n",
    "   py_version=\"py39\", # python version of the DLC\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "73d62cac",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "-----!"
     ]
    }
   ],
   "source": [
    "# deploy model to SageMaker Inference\n",
    "predictor = huggingface_model.deploy(\n",
    "   initial_instance_count=1,\n",
    "   instance_type=\"ml.m5.xlarge\"\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "27814d6b",
   "metadata": {
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[{'label': 'POSITIVE', 'score': 0.9996660947799683}]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# example request, you always need to define \"inputs\"\n",
    "data = {\n",
    "   \"inputs\": \"The new Hugging Face SageMaker DLC makes it super easy to deploy models in production. I love it!\"\n",
    "}\n",
    "\n",
    "# request\n",
    "predictor.predict(data)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "8a291719",
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# delete endpoint\n",
    "predictor.delete_model()\n",
    "predictor.delete_endpoint()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "f652f37f-e4f1-49bc-904a-33f10e471dc8",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "c281c456f1b8161c8906f4af2c08ed2c40c50136979eaae69688b01f70e9f4a9"
  },
  "kernelspec": {
   "display_name": "conda_pytorch_p39",
   "language": "python",
   "name": "conda_pytorch_p39"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.15"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
